Method and Evaluation of Character Stroke Preservation on Handprint Recognition

Author

  • Michael D. Garris
Abstract

A new technique for intelligent form removal has been developed, along with a new method for evaluating its impact on optical character recognition. The form removal technique automatically detects the dominant lines in an image and erases them while preserving as much of the overlapping character strokes as possible. This method of form removal relaxes the recognition system’s dependence on rigid form design, printing, and reproduction by automatically detecting and removing some of the physical structures (lines) on the form. The line detection and removal technique operates on loosely defined zones in which no image deskewing is performed. The technique was tested on a large number of randomly ordered handprinted lowercase alphabet fields, as these letters (especially those with descenders) frequently touch and extend through the line along which they are written. It is shown that intelligent form removal can improve lowercase recognition by as much as 3%, but this net gain alone does not reveal how the technique affects recognition. Trade-offs are to be expected when any new technique is introduced into a complex recognition system. A new statistical analysis was therefore designed to evaluate the impact of intelligent line removal on optical character recognition. This evaluation method compares the statistical distributions of individual confusion pairs between two systems and automatically determines the significant improvements and the significant losses in performance. For system developers to continue reducing error rates, sophisticated analyses like this become necessary to understand the real impact a modification has on recognition performance. For example, this method of evaluation should be very useful in squeezing higher performance out of voting systems. The statistical analysis presented in this paper was used to evaluate the new line removal technique, and the results are reported.
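The abstract does not say which statistical test the paper applies to the confusion pairs. As a rough illustration of the idea only, the sketch below swaps in a McNemar-style test: both systems label the same characters, and for each confusion pair (true letter, predicted letter) it counts how often only the baseline made that confusion and how often only the modified system did. The function name `compare_confusions`, the labels "improvement" and "loss", and the 3.841 threshold (chi-square, one degree of freedom, 5% level) are assumptions made for this example, not details from the paper.

```python
from collections import Counter


def compare_confusions(truth, pred_base, pred_mod, chi2_crit=3.841):
    """Flag confusion pairs that change significantly between two systems.

    truth, pred_base, pred_mod: equal-length sequences of character labels,
    where both systems classified the same characters in the same order.
    Returns a dict mapping (true, predicted) to "improvement" or "loss".
    Hypothetical stand-in test (McNemar-style), not the paper's analysis.
    """
    fixed = Counter()       # confusion made by the baseline only
    introduced = Counter()  # confusion made by the modified system only
    for t, b, m in zip(truth, pred_base, pred_mod):
        if b == m:
            continue  # both systems agree, so nothing changes for any pair
        if b != t:
            fixed[(t, b)] += 1
        if m != t:
            introduced[(t, m)] += 1

    verdicts = {}
    for pair in set(fixed) | set(introduced):
        f, i = fixed[pair], introduced[pair]
        # McNemar statistic on the discordant counts; f + i > 0 by construction
        stat = (f - i) ** 2 / (f + i)
        if stat >= chi2_crit:
            verdicts[pair] = "improvement" if f > i else "loss"
    return verdicts


if __name__ == "__main__":
    # Toy data: the baseline reads ten 'g' characters as 'q'; the modified
    # system fixes those but reads eight 'y' characters as 'v'.
    truth = ["g"] * 20 + ["y"] * 20
    base = ["q"] * 10 + ["g"] * 10 + ["y"] * 20
    mod = ["g"] * 20 + ["v"] * 8 + ["y"] * 12
    for pair, verdict in sorted(compare_confusions(truth, base, mod).items()):
        print(pair, verdict)
```

A per-pair report of this kind shows the point the abstract makes: a small net change in accuracy can hide significant gains on some confusion pairs and significant losses on others.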


Related resources

Component-based handprint segmentation using adaptive writing style model

Building upon the utility of connected components, NIST has designed a new character segmentor based on statistically modeling the style of a person’s handwriting. Simple spatial features (the thickness of the pen stroke and the height of the handwriting) capture the characteristics of a particular writer’s style of handprint, enabling the new method to maintain a traditional character-level se...


NIST Form-Based Handprint Recognition System

The National Institute of Standards and Technology (NIST) has developed a new release of a standard reference form-based handprint recognition system for evaluating optical character recognition. As with the first release, NIST is making the new recognition system freely available to the general public on CD-ROM. This source code testbed, written entirely in C, contains both the original and th...


Online Recognition of Isolated Handwritten Persian Characters Based on Main-Body Group Detection Using a Support Vector Machine

In this paper, a new method for the online recognition of handwritten Persian characters is proposed that uses a set of simple features and a Support Vector Machine (SVM) as the classifier. The preprocessing task allows us to equalize feature vectors from different characters. The algorithm is implemented in two steps. In the first step, the input character is classified into one of eighteen ...


A Neural Approach to Concurrent Character Segmentation and Recognition

This paper presents a neural network solution that combines character segmentation and character recognition concurrently as a single task. Current segmentation methods utilize traditional image processing techniques such as spatial histograms which are only 60% accurate on handprint. Using traditional techniques for segmenting handprint in a model recognition system running on a massively para...


Off-line Handwriting Recognition from Forms

A public domain optical character recognition (OCR) system has been developed by the National Institute of Standards and Technology (NIST) to provide a baseline of performance on off-line handwriting recognition from forms. The system’s source code, training data, and performance assessment tools are all publicly available. The system recognizes the handprint written on Handwriting Sample Forms...




Publication date: 2007